Nature Methods — Latest Matching Preprints

1

Privacy-Preserving Matching for Federated Causal Inference in Multicentre Patient Cohorts

Gusinow, R.; Morgan, A. S.; Canziani, L. M.; Zeitlin, J.; Kim, M.; Gentilotti, E.; Ghosn, J.; Florence, A.-M.; Tami, A.; Toschi, A.; Palacios-Baena, Z. R.; Tacconelli, E.; Hasenauer, J.

2026-07-19 epidemiology 10.64898/2026.07.16.26358171 medRxiv

Top 2%

5.5%

Causal effect estimates can often be biased in clinical and epidemiological studies as patient cohorts frequently exhibit substantial covariate imbalances between treated and control groups, often amplified in multicentre studies due to heterogeneous recruitment, clinical practice, and case mix. Covariate balancing methods are therefore essential for valid causal inference. However, their application becomes challenging when data are distributed across cohorts and cannot be pooled because of privacy, legal, or institutional constraints, leaving a gap in practical methods for causal effect estimation in federated and imbalanced clinical data settings. We develop a privacy-preserving framework for covariate balancing and causal effect estimation across distributed data providers, combining federated aggregation with differential privacy to enable propensity score subclassification and matching without sharing individual-level records. Matching relies on non-disclosive quantities and differentially private distance evaluation, and the resulting matched subsets remain local to each server. Balance can be assessed through federated diagnostics and privacy-preserving visualisations, and we provide secure estimators for average treatment effects with associated uncertainty quantification. We implement this framework in the DataSHIELD federated analysis platform via two R packages. In simulations, we demonstrate agreement between federated and centralised analyses in the absence of privacy noise and quantify the bias-variance trade-offs induced by differential privacy. We illustrate applicability in two multinational settings (a Long COVID cohort and very preterm birth cohorts), showing that the approach enables practical causal analyses under real-world data protection constraints. The DataSHIELD packages are available on GitHub. Additional methodological details are provided in the Supplementary Material.
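
One way to picture the federated balance diagnostics mentioned in this abstract: each site shares only per-arm counts, sums, and sums of squares for a covariate, and a coordinator combines them into a standardized mean difference. The sketch below is a hypothetical illustration in plain Python. The function names and the aggregate layout are assumptions, it is not the DataSHIELD API, and it leaves out the differential-privacy noise the authors describe.

```python
import math

def site_summary(values, treated):
    """Per-arm [n, sum, sum of squares] for one covariate, computed locally at a site."""
    summary = {True: [0, 0.0, 0.0], False: [0, 0.0, 0.0]}
    for x, t in zip(values, treated):
        arm = summary[bool(t)]
        arm[0] += 1
        arm[1] += x
        arm[2] += x * x
    return summary

def pooled_smd(summaries):
    """Standardized mean difference computed from site-level aggregates only."""
    stats = {}
    for arm in (True, False):
        n = sum(s[arm][0] for s in summaries)
        total = sum(s[arm][1] for s in summaries)
        total_sq = sum(s[arm][2] for s in summaries)
        mean = total / n
        var = (total_sq - n * mean * mean) / (n - 1)  # sample variance
        stats[arm] = (mean, var)
    (mean_t, var_t), (mean_c, var_c) = stats[True], stats[False]
    return (mean_t - mean_c) / math.sqrt((var_t + var_c) / 2)
```

Only the three aggregates per arm leave each site, so the coordinator never sees individual records. Whether such aggregates are disclosive for very small arms is a separate question that this sketch does not address.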

2

FootNet: A Multi-View Smartphone Dataset and Four-Model Benchmark for Clinical Foot Segmentation

Vijay, A.; Prabhune, A.; Srihari, V. R.; Rayampalli, A.

2026-07-17 health informatics 10.64898/2026.07.15.26358117 medRxiv

Top 4%

1.9%

We present FootNet, a 453-image multi-view smartphone foot dataset for binary foot segmentation, with expert-annotated masks across six anatomical views (dorsal, medial, and plantar, both left and right). We benchmark four segmentation models under a controlled protocol: U-Net with a MobileNetV2 encoder achieves the best performance (IoU 0.9268, Dice 0.9608, 95% CI [0.9209, 0.9320]); DeepLabV3 with MobileNetV3-Large scores IoU 0.8984 (Dice 0.9449); UNet++ with MobileNetV2 scores IoU 0.8913 (Dice 0.9391); and SAM ViT-B with an oracle bounding-box prompt scores IoU 0.9219 on the matched 191-image subset. Bonferroni-corrected Wilcoxon signed-rank tests (k = 6 comparisons) show U-Net significantly outperforms DeepLab (p < 0.001, r = 0.638) and SAM ViT-B with oracle bounding box (p = 0.005, r = 0.202); UNet++ does not significantly differ from DeepLab (p = 0.062). Connected-component postprocessing yields negligible benefit (mean ΔIoU = +0.0003, 12 of 453 images improved). The extended dataset is available upon request.

3

The Variance-Stabilizing Transformation for the Poisson Rate Ratio: Closed-Form Confidence Intervals

Ng, S.-P.

2026-07-18 epidemiology 10.64898/2026.07.16.26358255 medRxiv

Top 4%

1.7%

The incidence rate ratio R is the standard measure for comparing event rates in clinical trials and epidemiology. In vaccine trials, the vaccine efficacy is VE = 1 - R. When events are rare, the two arm counts are Poisson. The estimator of R is heteroskedastic: its sampling variance changes with the data. So no fixed-width interval covers correctly everywhere. The usual log-Wald interval is undefined at zero events and covers poorly at small counts. Early vaccine and drug-safety readouts fall in exactly this regime. We show that a single reparameterization collapses this bivariate problem to an effective one-parameter family with a quadratic variance function, whose variance-stabilizing transformation is 2 arcsinh(sqrt(R)). The reduction yields a closed-form confidence interval for R. Its two leading errors, a curvature bias and the variability of the estimated scale, each admit a closed-form correction with no tuning constants. In a Monte Carlo study of our seven arcsinh variants and five competitors, the +Curve+Stu variant covers within 0.002 of the nominal 0.95 for about 50 control and 5 treatment events. Its width is on par with the best competitor. It avoids the conservatism and zero-count breakdown of log-Wald and MOVER. For moderate counts, we recommend this interval; for sparser data, our Bar-Lev and Enis count-shift variant is more robust. The result is a ready-to-use, closed-form interval for the low-count regime. We illustrate it on early Covid-19 vaccine-efficacy readouts and provide reference implementations in R and Python.
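
The transformation named in this abstract can be checked directly: the derivative of 2 arcsinh(sqrt(R)) is 1/sqrt(R(1 + R)), so it stabilizes any variance proportional to R(1 + R). The sketch below contrasts the log-Wald interval with a naive interval built on that scale. It is an assumed delta-method reading, not the paper's corrected +Curve+Stu interval: it takes equal exposure in both arms, estimates the control mean by the control count x0, and uses hypothetical function names.

```python
import math

def log_wald_interval(x1, x0, z=1.96):
    """Log-Wald interval for R = rate1 / rate0 (equal exposure); undefined at zero counts."""
    if x1 == 0 or x0 == 0:
        raise ValueError("log-Wald interval is undefined when a count is zero")
    log_r = math.log(x1 / x0)
    se = math.sqrt(1 / x1 + 1 / x0)
    return math.exp(log_r - z * se), math.exp(log_r + z * se)

def arcsinh_interval(x1, x0, z=1.96):
    """Naive interval on the 2*arcsinh(sqrt(R)) scale (assumes equal exposure, x0 > 0)."""
    r_hat = x1 / x0
    centre = 2 * math.asinh(math.sqrt(r_hat))
    # Delta method under the assumption Var(R_hat) ~ R(1 + R) / mu0, with mu0 ~ x0.
    half_width = z / math.sqrt(x0)
    lower = max(centre - half_width, 0.0)  # the transformed scale starts at 0
    upper = centre + half_width
    # Invert g(R) = 2*asinh(sqrt(R)) via R = sinh(g / 2)**2.
    return math.sinh(lower / 2) ** 2, math.sinh(upper / 2) ** 2
```

With zero treatment events the arcsinh version still returns a finite interval starting at 0, whereas the log-Wald function raises an error, which mirrors the zero-count breakdown the abstract describes.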

4

From amplicon to antigen: a quantified transmission map that nominates multi-antigen antibody-drug-conjugate co-target sets across cancer types

Lam, J. M.; Walker-Samuel, S.; Pennycuick, A.

2026-07-16 oncology 10.64898/2026.07.13.26357987 medRxiv

Top 6%

0.9%

Somatic copy-number amplification is pervasive in cancer, and the genes it carries are candidate drug targets - but only those whose amplification is transmitted to accessible surface protein can be reached by an antibody-drug conjugate (ADC). We build an integrated map of copy-number-to-protein transmission across six tumour types and ask, for every amplified gene, whether its dosage reaches the surface. Copy number transmits to mRNA (median per-gene r = 0.21) but is attenuated at the protein level in 85% of genes, and the mRNA ranking is largely preserved to protein (rho = 0.70); the ranking is set principally at the chromatin/transcription step - among directly measured regulatory inputs, promoter DNA methylation and tumour chromatin accessibility each explain about an order of magnitude more of the transmission variance than gene structure, and do so complementarily. Critically, transmissibility is a stable, gene-intrinsic property: it is predictable from gene properties alone, with no proteomic input, at a leave-gene-out rank correlation of 0.52 (R2 = 0.29); it is not positional (holding out whole chromosome arms changes accuracy by 0.001); and it transfers across lineages (Kendall W = 0.97 across leave-one-lineage-out refits). This licenses a predictor that nominates surface targets in cancer types that lack a tissue-referenced proteome, combining direct protein measurement where it is available with prediction where it is not. 
Requiring co-elevation on a recurrent amplicon with measured transmissibility and an accessible extracellular ectodomain nominates 22 surface antigens on 18 distinct recurrent amplicons across four cancer types (renal, endometrial and both lung subtypes) - for example ITGB8+TSPAN13+TTYH3 on lung 7p, NCSTN+HSD17B7+MPZL1 on 1q (recurrent in several types), the transferrin receptor TFRC on squamous 3q, and FZD1 on clear-cell renal 7q; 21 of the 22 are non-driver passengers and 10 are confirmed on the experimental Cell Surface Protein Atlas. In single malignant cells, against a null that controls for per-cell sequencing depth, the co-detected constructs sit at a modest 1.05-1.45x above independence (p < 0.001, donor-block bootstrap intervals clear of 1.0), and at binding-relevant thresholds the normal-tissue co-expression collapses - so an avidity AND-gate that binds stably only where the antigens co-occur would spare normal cells that carry only one. Observed transmissibility itself transfers strongly between the two lung subtypes (ρ = 0.88) and remains positive across distant lineages, consistent with the shared cell-of-origin regulation the map implies. Single-cell co-detection is demonstrated wherever a malignant single-cell atlas exists (both lung subtypes and glioblastoma - the latter entirely from prediction, using no GBM surface-abundance measurement); the remaining cohorts are nominated on the same genetic and topological evidence. The result is a pan-cancer, confidence-tiered catalogue of multi-antigen ADC co-target sets with a concrete plan to test them.

5

A ReAct Agentic AI System for Natural Language Querying and Statistical Analysis of The Cancer Genome Atlas Clinical Data

Korutla, R.; Amal, S.

2026-07-17 health informatics 10.64898/2026.07.15.26358188 medRxiv

Top 6%

0.8%

The Cancer Genome Atlas (TCGA) holds clinical data for over 11,000 patients across 33 cancer types, but access is hard because of complex file structures, heterogeneous formats, and the need for programming. We present an agentic system for natural language querying and statistical analysis of TCGA clinical data. The system uses a large language model as an autonomous ReAct agent that selects from eight computational tools, including data extraction, descriptive statistics, Kaplan-Meier survival analysis with log-rank tests, hypothesis testing, and verification against the curated TCGA Pan-Cancer Clinical Data Resource (CDR). The agent reasons about intermediate results, adapts its approach, and returns clinically contextualized responses with source attribution and auditable traces. We introduce TCGA-Agent-Bench, 440 queries across five difficulty tiers with ground truth from the independently curated TCGA-CDR, evaluated with dual metrics of numerical accuracy and clinical completeness. The system achieves 93.4% overall accuracy (100% single-patient lookups, 99.1% cohort statistics, 92.8% comparative analyses), outperforming a fixed rule-based pipeline (87.1%), a single-pass LLM (81.8%), and retrieval-augmented generation (66.9% on a subset). Most of the benchmark is answerable from the CDR alone, so we locate the extraction layer's value in fields the CDR lacks (drug treatments, TNM components, biomarkers, biospecimen metadata): on 26 queries targeting these, the full system answers 100% versus 3.8% for CDR-only. Ablations show the reasoning loop is most impactful (+9.1% accuracy, +22.0 completeness points). A tool-based agentic architecture enables accurate, auditable analysis of clinical repositories, with value driven by tool design and recovered fields rather than model scale.

6

CuGen: A GPU-accelerated framework for large-scale genomics

Kiiskinen, T.; Richland, J.; Wang, W.; Lu, W. S.; Balasubramanian, N.; Hastie, T.; Tibshirani, R.; Rivas, M. A.

2026-07-17 genetic and genomic medicine 10.64898/2026.07.15.26358178 medRxiv

Top 6%

0.8%

Biobank-scale genomic analyses remain computationally expensive, CPU-bound workflows, particularly when adjusting for confounding. Here, we present CuGen, a GPU-accelerated framework for large-scale genomics. CuGen uses UltraLasso, a novel hierarchical application of univariate-guided sparse regression (uniLasso), to select a compact, phenotype-informed active set of fewer than 30,000 variants. This achieves robust leave-one-chromosome-out (LOCO) confounding control, enabling both downstream GWAS and in-sample fine-mapping. Additionally, we introduce the .cugen file format, a genotype representation designed for memory-optimized, high-throughput streaming and random access on GPU hardware. Building on this substrate, we provide a general GPU-accelerated genomics toolkit handling polygenic prediction, data manipulation, quality control, analysis, and visualization. We demonstrate CuGen's efficacy in the UK Biobank with up to 408,624 individuals, where the full GWAS pipeline and fine-mapping against 6.8 million imputed variants completes in approximately 10 minutes on a single high-throughput GPU with 80 GB of memory. The pipeline scales efficiently to massive phenome-wide analyses with sublinear resource consumption.

7

Gradient-guided adapter merging for neuroimaging vision-language models

Bit, S.; Guney, O. B.; Jia, S.; Kolachalama, V. B.

2026-07-21 health informatics 10.64898/2026.07.18.26358397 medRxiv

Top 9%

0.4%

Automated interpretation of neuroimaging studies requires simultaneous assessment of multiple imaging evidence variables, each tied to distinct anatomical structures. Vision-language models (VLMs) offer a unified framework for multi-task analysis, but adapting pre-trained VLMs remains challenging. Full fine-tuning is computationally prohibitive, and joint multi-task training requires simultaneous access to all task data, which is often infeasible in clinical settings. Although model merging enables multi-task composition without joint re-training, existing methods focus on post-hoc algorithms with limited extension to VLMs and minimal application to neuroimaging. Here, we present GRadient-guided Adapter Merging (GRAM), a layer-selective low-rank adaptation (LoRA)-based fine-tuning and merging framework for multi-task neuroimaging visual question-answering (VQA). GRAM uses a gradient ratio that contrasts class-specific gradients to identify task-discriminative layers, and applies subspace-constrained projected gradient descent to restrict LoRA updates to directions consistent with the geometry of the pre-trained model. We leveraged a structured VQA benchmark, developed from the National Alzheimer's Coordinating Center (NACC) dataset, that pairs multi-sequence brain MRI studies with question-answer pairs across clinically relevant imaging evidence variables. Experiments on the VQA benchmark showed that GRAM outperformed or matched all-layer LoRA fine-tuning and a standard merging baseline while reducing inter-task interference during merging, and approached or surpassed the performance of joint multi-task training without joint re-training.

8

ReCo: a self-configuring and self-extending agentic framework for biomedical research

Tzanis, E.; Klontzas, M. E.

2026-07-16 health informatics 10.64898/2026.07.14.26358025 medRxiv

Top 12%

0.2%

This study presents ReCo (Research Cosmos), a self-configuring and self-extending agentic research framework for the biomedical domain. ReCo is orchestrated by a large language model that interacts with native computing tools, bundled Model Context Protocol (MCP) servers, structured skills, persistent project memory, and a desktop interface. Its bundled MCP servers provide biomedical analysis capabilities while serving as implementation paradigms for integrating new computational and AI frameworks. Structured skills encode procedures for environment configuration and framework ingestion, enabling ReCo to inspect repositories, manuscripts, or local codebases; identify dependencies and execution patterns; create isolated runtime environments; design and implement MCP interfaces. Self-extension was evaluated using five heterogeneous systems: the Merlin computed tomography foundation model, MAISI-v2 medical image synthesis framework, asari liquid chromatography-mass spectrometry workflow, DosimeTron agentic radiation-dosimetry platform, and Orthanc DICOM server. ReCo successfully operationalized all five systems and completed predefined functional evaluations. Re-hosted DosimeTron outputs demonstrated near-perfect agreement with the reference pipeline across 651 organ observations (Pearson correlation and Lin concordance correlation coefficient, 0.99999; mean absolute percentage difference, 0.37%). Notably, ReCo configured Orthanc as a PACS-like coordination layer, integrated it with DosimeTron, Merlin, and TotalSegmentator, and orchestrated data retrieval, analysis, and return of valid DICOM RTSTRUCT, RTDOSE, and Structured Report. ReCo provides a unified environment for configuring, documenting, and operationalizing heterogeneous biomedical frameworks, reducing technical barriers to the adoption and integration of emerging computational and AI methods. The official open-source ReCo GitHub repository is available at: https://github.com/eltzanis/ReCo

9

Efficient stochastic epidemic simulation via the Sellke construction

van Boven, M.; Bootsma, M. C.

2026-07-17 epidemiology 10.64898/2026.07.16.26358219 medRxiv

Top 13%

0.2%

Stochastic epidemic models are a cornerstone of infectious disease epidemiology and are often used to study intervention scenarios. However, large run-to-run variability can make intervention effects difficult to estimate precisely. We revisit the epidemic Sellke construction, which assigns each individual an infection threshold for the cumulative infection hazard such that, conditional on the thresholds, the epidemic trajectory becomes deterministic. This enables coupling of simulations with and without an intervention, yielding low-variance effect estimates even when outcomes such as final size or peak incidence vary widely between runs. We develop an exact, event-driven implementation that maintains infection and recovery events in priority queues. Cumulative infection-hazard updates require O(log N) time per event, yielding overall complexity O(E log N) for E events in a population of size N. The implementation achieves computational performance comparable to the classical Gillespie algorithm while naturally accommodating non-Markovian infectious periods and complex infectiousness profiles. We illustrate the approach using distance-dependent spread of avian influenza between poultry farms in the Netherlands and a multilayer population with households, schools, and workplaces. In both examples, coupling enables efficient within-run comparisons of intervention scenarios across stochastic realisations.
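
The coupling idea can be sketched with the final-size shortcut of the Sellke construction: once thresholds and infectious periods are drawn, the outbreak size is a deterministic function of the transmission rate, so a baseline and an intervention scenario can reuse the same draws. This is not the authors' event-driven priority-queue implementation. It assumes a homogeneously mixing SIR population, unit-rate exponential thresholds, and infection pressure scaled by beta / N, and every name in it is hypothetical.

```python
import random

def sellke_final_size(thresholds, infectious_periods, beta, n_initial):
    """Number of initially susceptible individuals ever infected.

    thresholds: one resistance threshold per susceptible.
    infectious_periods: one period per individual, in order of infection
    (the first n_initial entries belong to the initial infectives).
    """
    n_sus = len(thresholds)
    pop = n_sus + n_initial
    q = sorted(thresholds)
    # Total infection pressure generated by the initial infectives.
    pressure = beta / pop * sum(infectious_periods[:n_initial])
    infected = 0
    # The next susceptible is infected once the pressure reaches its threshold.
    while infected < n_sus and q[infected] <= pressure:
        pressure += beta / pop * infectious_periods[n_initial + infected]
        infected += 1
    return infected

def coupled_effect(beta_base, beta_intervention, n_sus=200, n_initial=1, seed=1):
    """Final sizes for two scenarios that share thresholds and infectious periods."""
    rng = random.Random(seed)
    thresholds = [rng.expovariate(1.0) for _ in range(n_sus)]
    periods = [rng.expovariate(1.0) for _ in range(n_sus + n_initial)]
    base = sellke_final_size(thresholds, periods, beta_base, n_initial)
    interv = sellke_final_size(thresholds, periods, beta_intervention, n_initial)
    return base, interv
```

Because both scenarios share the same random draws, lowering beta can never increase the final size within a run, so the per-run difference isolates the intervention effect from run-to-run noise.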

10

Learned ultrasound segmentation and deformable CT fusion for augmented reality endovascular surgery

Dillon, T. M.; Quevedo Moreno, D.; Rutherford, E. K.; Ayers, B.; Salomon, B.; Kubi, B.; Thomas, J.; Roche, E.

2026-07-17 cardiovascular medicine 10.64898/2026.07.15.26358084 medRxiv

Top 13%

0.1%

Minimally invasive endovascular procedures offer reduced surgical trauma, shorter recovery times, and improved outcomes, but rely on 2D fluoroscopic X-ray imaging, which provides limited depth perception and exposes patients and clinicians to ionizing radiation. Here we present an augmented reality (AR) system that fuses intravascular ultrasound (IVUS) and electromagnetic (EM) position tracking with preoperative computed tomography (CT) to produce an anatomically accurate, deformation-corrected navigational reference. A robotic device performs ECG-gated pullback of the IVUS probe, capturing 4D aortic motion across the cardiac cycle. We introduce a deep learning architecture for extracting vascular lumen boundaries and side-branch orifices from artifact-prone IVUS streams, and a semantically driven non-rigid CT-IVUS fusion pipeline robust to false positive landmarks. We evaluate the platform with trained surgeons in benchtop phantom studies and in-vivo ovine models, and demonstrate its application to fenestrated endovascular aneurysm repair (FEVAR). Compared to fluoroscopy alone, AR guidance significantly reduces cannulation time, radiation exposure, and cognitive workload, while improving procedural efficiency and safety. Our IVUS-EM and CT aortic datasets are released open source.

11

Identification of Persistent Radiomics Feature Co-occurrence Across Diverse Tissue Types and Individuals: A Network-Based Analysis of the RADAPT CT Atlas

Amiri, S.; Afshar, P.; Rohban, M. H.

2026-07-19 radiology and imaging 10.64898/2026.07.17.26358252 medRxiv

Top 14%

0.1%

Objectives. Radiomics pipelines extract hundreds of quantitative features that are widely known to be redundant, but the structure of this redundancy is usually treated as a per-dataset nuisance to be pruned away. We tested the alternative hypothesis that a substantial number of feature-feature correlations are universal: they persist across patients and across anatomically distinct structures because they reflect shared mathematical and image-statistical properties of how the image is summarised, rather than properties of the tissue being imaged. Materials and Methods. We re-analysed the publicly available Radiomics Atlas Dataset of normal Abdominal and Pelvic CT (RADAPT), restricting the analysis to the 526 non-contrast-enhanced examinations of the 531-subject atlas and to the 107 original (non-filtered) PyRadiomics features. The 53 segmented structures were grouped into four broad anatomical categories: bones, muscles, vessels, and parenchymal organs. RADAPT is distributed as one Excel file per structure, with patients as rows and features as columns. Within each structure file we z-score-normalised every feature across patients, computed the absolute Spearman correlation matrix, and retained edges with |ρ| ≥ τ for τ in {0.70, 0.80, 0.90}. We then intersected the edge sets across all structure files to obtain a "universal" correlation graph, in which an edge survives only if it exceeds the threshold in every structure (each estimated across the full patient sample). Stable feature communities were defined as the maximal cliques of this graph. Robustness to patient sampling was tested by repeating the entire pipeline on five independent random splits of each file into two patient halves (10 sub-cohorts per threshold), and the implementation was independently reproduced in R. Results. 
Despite the strictness of the global-intersection criterion, 34, 24, and 14 stable feature communities survived at τ = 0.70, 0.80, and 0.90 respectively, with the largest cliques containing six members at τ = 0.70 and τ = 0.80 and five members at τ = 0.90. The community structure was clearly interpretable: separate cliques captured (i) variance-like intensity dispersion, (ii) long-run / low-frequency (coarse) texture, (iii) high gray-level texture, (iv) low gray-level texture, (v) volume and surface shape, and (vi) local-homogeneity and energy/entropy duals. On random-half resampling the exact-match recovery rate of these communities was 81.5%, 86.7%, and 80.7% across the three thresholds; departures from exact recovery were almost always a single boundary feature added or dropped, consistent with finite-sample fluctuation of near-threshold edges rather than structural instability. The R re-implementation reproduced the Python results exactly. Conclusion. A substantial portion of radiomics feature collinearity is universal across patients and tissues. We distinguish two layers within it: trivial near-algebraic duals that are universal by construction, and non-trivial cross-matrix-family communities that are the genuine empirical finding. Together they provide an interpretable, definition-grounded basis for aggressive dimensionality reduction, for retrospectively reconciling apparently different feature selections in the literature, and for moving radiomics pipelines toward organ-agnostic, more reproducible models. Clinical relevance statement. Selecting a single representative feature from each universal community shrinks the original-feature space by roughly an order of magnitude without sacrificing biologically distinct information. 
For example, the five variance-family members (first-order Variance, GLCM SumSquares, GLCM ClusterTendency, GLDM and GLRLM GrayLevelVariance) can be replaced by a single representative, removing redundant degrees of freedom that would otherwise inflate model variance; and labelling each retained feature by its community lets two studies that selected different variance-family names be recognised as having found the same signal, simplifying model development and improving cross-cohort generalisability in clinical CT workflows.
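
The intersect-then-clique step described in the Materials and Methods of this abstract can be sketched in a few lines. The code below is an illustrative reading rather than the authors' pipeline: it assumes the per-structure Spearman correlations are already available as dictionaries keyed by feature pairs, uses a plain Bron-Kerbosch enumeration without pivoting, and all names are hypothetical.

```python
def universal_edges(corr_by_structure, tau):
    """Edges whose absolute correlation is at least tau in every structure.

    corr_by_structure maps a structure name to {(feature_a, feature_b): rho},
    with feature_a != feature_b.
    """
    edge_sets = []
    for corr in corr_by_structure.values():
        edges = {frozenset(pair) for pair, rho in corr.items() if abs(rho) >= tau}
        edge_sets.append(edges)
    return set.intersection(*edge_sets) if edge_sets else set()

def maximal_cliques(edges):
    """Enumerate maximal cliques of an undirected graph (Bron-Kerbosch, no pivoting)."""
    adj = {}
    for edge in edges:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques
```

An edge that falls below the threshold in even one structure is dropped from the universal graph, which is why the criterion is strict and why near-threshold edges can move a feature in or out of a community under resampling.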

12

CRISPR RNA-independent activation of Cas12a

Iwe, I. A.; Singh, S.; Guan, K.; Ocampo, R. F.; Ribeiro da Silva, S. J.; Wachholz Junior, D.; Emami, N.; Corsano, A.; Zeisler, I.; Bozovicar, K.; Wang, L.; Ham, D.; Cai, R.; Kelly, P.; Zayeni, R.; Nguyen, J.; Bayat, P.; Charania, M.; Palter, S.; Liu, F. X.; Shrestha, S.; Rayhan, A.; Wasney, G. A.; Mazzulli, T.; Green, A. A.; Li, Z.; Yao, S.; Hubbard, B. P.; Taylor, D. W.; Pardee, K.

2026-07-16 primary care research 10.64898/2026.07.14.26358058 medRxiv

Top 15%

0.1%

CRISPR-Cas12a nucleases are classically activated through CRISPR RNA (crRNA) guided and PAM-dependent target recognition, which together establish a canonical heteroduplex associated with nuclease activation. Here we identify a crRNA- and PAM-independent activation pathway for Cas12a that reveals previously unrecognized conformational plasticity within its nucleic acid recognition interface. We show that short RNAs can directly occupy the canonical crRNA-binding channel and trigger a catalytically competent trans cleavage state in the absence of PAM recognition or canonical R-loop formation. Biochemical assays indicate that short RNAs bind the crRNA-binding channel and are competitively displaced by cognate crRNA, consistent with binding at a conserved nucleic acid-binding interface. Cryo-electron microscopy (cryo-EM) further reveals that Cas12a maintains its global catalytic architecture while exhibiting loss of canonical PAM-dependent stabilization and increased flexibility of the RuvC lid, alongside accommodation of a noncanonical RNA-DNA hybrid with inverted polarity relative to the crRNA-target duplex. This crRNA-independent activation pathway enables programmable, amplification-free detection of DNA and RNA targets independent of canonical guide-mediated recognition. Together, these findings define an alternative activation geometry for Cas12a and expand models of Class 2 CRISPR-Cas effector activation beyond crRNA- and PAM-directed recognition.

13

FoodScribe: an open-source semantic framework for nutrient estimation from free-text dietary records

Gouda, H.; Sala Climent, M.; Agongo, J.; Gaikwad, S. P.; Nattakom, A.; Zhao, H. N.; Xing, S.; Boland, B. S.; Holt, T.; Guma, M.; Dorrestein, P. C.

2026-07-17 nutrition 10.64898/2026.07.15.26358181 medRxiv

Top 15%

0.1%

Efficiently summarizing dietary records at scale remains a persistent bottleneck in nutritional epidemiology. We present FoodScribe, which translates free-text meal descriptions into quantitative nutrient profiles by combining ingredient parsing with nutrient retrieval from the USDA FoodData Central (FDC) database. Benchmarked with three LLM providers on the Nutribench dataset, FoodScribe completed annotation of 3,807 meal descriptions in 2.5 hours, a task otherwise requiring substantial manual effort from trained nutritionists. FoodScribe achieved F1 scores of 0.79-0.89 across macronutrient estimation, with models performing better for protein than for fat estimation. Application to a Mediterranean diet intervention cohort indicated dietary shifts consistent with the intervention pattern based on model-derived estimates. Integration with metabolomics data suggested that fiber and vegetable intake were positively associated with a fecal metabolite cluster.

14

Statistical Inference and Power Analysis for Comparative F1 and Fβ Scores under Correlated Classifier Pairs

Hsu, C.-Y.; Liu, Q.; Shyr, Y.

2026-07-17 dermatology 10.64898/2026.07.15.26358166 medRxiv

Top 17%

0.1%

As machine learning and artificial intelligence systems are increasingly used in healthcare, rigorous evaluation of their classification performance has become critical. The F1 and Fβ scores are widely adopted metrics for assessing performance in imbalanced biomedical data. Recently, we introduced psF1, a unified statistical framework for inference and study design for single and comparative F1 and Fβ scores under the assumption of independent classifiers. In practice, however, benchmarking two classifiers on the same dataset creates a correlated paired setting. Ignoring this intrinsic dependency leads to overestimation of the standard error and a substantial loss of statistical power. To address this, we develop psF1pair, an advanced framework for statistical inference and power analysis that explicitly accounts for correlations between classifier pairs. Extensive simulation studies demonstrate the performance of psF1pair, and its utility is further illustrated through application to a real-world imaging classification system. As expected, higher correlation between classifiers yields narrower confidence intervals and enhanced statistical power. A freely available R package is provided to facilitate implementation, supporting accurate evaluation and study design for predictive and classification models in biomedical research.

15

Transducin: an open-source pipeline recovering SNOMED-CT coded measurements from the undocumented Optopol .OPT and Zeiss Cirrus private-tag formats as DICOM Structured Reports

Jaurrieta Hinojos, J. N.; Palomares Ordonez, J. L.; Chacon Hinojos, J. F.; Folgueras Batres, M. A.

2026-07-17 ophthalmology 10.64898/2026.07.14.26357256 medRxiv

Top 19%

0.1%

Background. Quantitative optical coherence tomography (OCT) measurements are essential for retinal disease monitoring, yet leading vendors store acquisition data in undocumented proprietary formats or encode measurements exclusively in private DICOM tags inaccessible to open systems. Methods. We present Transducin, an open-source Python library that reverse-engineers the undocumented Optopol Revo FC130 and Revo 60 .OPT binary format and extracts quantitative measurements from Zeiss Cirrus HD-OCT private DICOM tags, generating TID 1500 Structured Reports with SNOMED-CT coded findings for both platforms. A novel finding, that OCTPARAMS tag 23 encodes ocular laterality through the arithmetic sign of the foveal horizontal position, enables geometry-based laterality inference requiring no operator data entry, validated across 18 files from two device models and four software versions with 100% accuracy. Results. The primary corpus of 452 Optopol .OPT files (73 patients, 7 acquisition types) was parsed with 100% success. Cross-version compatibility was confirmed across SOCT versions 11.5.0 through 21.5.0, spanning approximately eight years of software development. The Zeiss Cirrus pipeline generated TID 1500 SRs for all 41 applicable studies (100%), yielding CMT 203 to 630 µm and RNFL 53 to 123 µm across a clinically representative range. Conclusions. Transducin provides the first publicly documented specification of the Optopol .OPT format and the first open-source multivendor pipeline generating SNOMED-CT coded DICOM Structured Reports from both Optopol Revo and Zeiss Cirrus devices, closing a gap explicitly confirmed by both manufacturers' own documentation. The code is available at https://github.com/oftalmos-org/transducin (Apache License 2.0).

16

The Registry of Pregnant Women at Cruces University Hospital: an ethical framework for prospective research with preanalytical optimization of maternal plasma processing

Gonzalez-Moro, I.; Sanchez-Garcia, H.; Medina Cuesta, T.; Rodriguez Lirio, A.; Espin Lopez, M. d. P.; Esquivel Gonzalez, S.; Quintana Ochoa de Alda, E.; de la Pena-Sanz, M.; Marin Cano, L.; Sarasua-Blanco, N.; Ortiz Salinas, P.; Sanfeliu Padulles, A.; Ruiz Adrian, A.; Martinez Isidoro, A.; Aldaiturriaga Otaola, A.; Aramburu Gil, A.; Garcia Gil, A.; Saenz Saenz, A.; Heredia Campos, A.; Fernandez Salado, A.; Ramirez Jarana, A. I.; Tobar Lopez, A. I.; Casarojos Oses, A. J.; Martinez de Maranon Toral, A.; Satiago Hidalgo, A.; Silva Diaz, A.; Basterrechea Miguel, A.; Castanos Lasa, A.; Esteras Vadi

2026-07-17 obstetrics and gynecology 10.64898/2026.07.17.26357942 medRxiv

Top 19%

0.0%

Background: Prospective pregnancy registries and biobanking infrastructures are essential for future translational studies investigating maternal, placental and offspring health. However, circulating nucleic acid analyses are highly sensitive to preanalytical variability, particularly regarding blood-collection tube type and sample processing conditions. We established a prospective pregnancy registry and biobanking workflow at Cruces University Hospital and evaluated the impact of preanalytical variables on circulating cell-free DNA (cfDNA) and cell-free RNA (cfRNA) preservation in maternal plasma collected at delivery. Methods: The Registry of Pregnant Women at Cruces University Hospital was designed as a prospective infrastructure integrating placental sampling, maternal blood collection and ethically controlled future access to maternal and offspring clinical data. Within this framework, peripheral blood samples from 50 women at delivery were simultaneously collected into EDTA, Norgen and Roche tubes. Plasma samples processed within or after 24 hours following collection underwent cfDNA/cfRNA extraction, electrophoretic profiling, fluorometric quantification and RT-qPCR analyses targeting different stress-related genes. Results: By the end of June 2026, 1,127 women had been prospectively recruited into the registry, with 661 plasma samples, 637 serum samples and 858 sets of four placental biopsies collected, processed and stored in the Basque Biobank. In the preanalytical substudy, EDTA tubes yielded higher cfDNA concentrations, likely reflecting reduced cellular preservation and genomic DNA contamination. In contrast, Roche tubes showed superior cfRNA preservation, with higher cfRNA concentrations and more consistent detection of the characteristic 5S rRNA peak compared with EDTA and Norgen tubes. 
Processing delays beyond 24 hours reduced cfRNA concentration, while associations between circulating transcripts and gestational age were more consistently detectable in preservative-containing tubes. Conclusions: Prospective infrastructures like ours offer a strong foundation for large-scale, long-term studies in the framework of the Developmental Origins of Health and Disease hypothesis. Technically, Roche tubes provided superior cfRNA preservation and enhanced sensitivity for detecting subtle biological associations, supporting the importance of standardized preanalytical workflows within prospective pregnancy biobanking resources.

17

Prescribing Trends of Antimicrobials in Obstetric and Gynaecological Inpatients: A Prospective Drug Utilization Study with Concurrent Antimicrobial Stewardship Audit from a Tertiary Care Hospital in Karachi, Pakistan

Ansari, T.; Zehra, A.; Jabbar, S.; Fatima, M.; Syed, B.; Shah, S. S. A. M.; Ahmed, A. S.; Hamid, A.; Ashafaq, H.

2026-07-17 obstetrics and gynecology 10.64898/2026.07.16.26358229 medRxiv

Top 19%

0.0%

Background: Antimicrobial resistance (AMR) disproportionately affects low- and middle-income countries (LMICs) such as Pakistan, where obstetric and gynaecological (OBGYN) patients carry high antibiotic exposure. Specialty-specific drug utilization data with concurrent stewardship audit remain scarce. This study evaluated antibiotic prescribing patterns, consumption metrics, and antimicrobial stewardship (AMS) programme compliance in OBGYN inpatients at a public-sector tertiary care hospital. Methods: A prospective cross-sectional study was conducted in the OBGYN wards of Dow University Hospital, Karachi, from 1 September to 31 October 2025. Women receiving ≥1 systemic antibiotic were included. Daily AMS rounds were conducted by an Infectious Diseases physician and a pharmacist. Antibiotic consumption was measured as Defined Daily Doses (DDD) and Days of Therapy (DOT) per 1,000 patient-days (total = 821). Antibiotics were classified using the WHO AWaRe (2023) framework. Results: Of 812 total admissions, 278 patients (34.2%) received ≥1 antibiotic and were enrolled (205 obstetric, 73 gynaecological), generating 636 prescriptions (mean 2.29/patient). Surgical prophylaxis was the predominant documented indication (213, 33.5%); 65.1% carried no documented indication. By AWaRe classification, 53.6% were Access-group and 46.1% Watch-group. Ceftriaxone (38.4%) and metronidazole (36.8%) together represented 75.2% of prescriptions. Combined DDD/1,000 patient-days was 1,758.6 and DOT/1,000 patient-days was 1,852.7. AMS compliance was 0%. Conclusions: This study documents a high antibiotic prescribing burden, near-universal documentation failure, and zero AMS compliance in OBGYN inpatients at a Pakistani public-sector hospital. The predominance of Watch-group antibiotics and undocumented surgical prophylaxis highlights structural stewardship gaps. Findings support an urgent need for institutional OBGYN antibiotic guidelines and structured pharmacist-led AMS programmes.
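The consumption metrics above are simple rates, and a minimal sketch may make the arithmetic concrete. The snippet recomputes two proportions from the counts reported in the abstract and shows how a per-1,000-patient-days rate is formed. The function name and the example days-of-therapy count are hypothetical; the abstract does not report raw DDD or DOT totals.

```python
def per_1000_patient_days(total, patient_days):
    """Normalise a count (e.g. DDD or DOT) to a rate per 1,000 patient-days."""
    return 1000.0 * total / patient_days


# Counts taken from the abstract.
admissions = 812
enrolled = 278
prescriptions = 636

share_enrolled = 100.0 * enrolled / admissions  # percent of admissions given >=1 antibiotic
mean_rx = prescriptions / enrolled              # prescriptions per enrolled patient

# Hypothetical raw total, for illustration only (not reported in the abstract).
example_days_of_therapy = 1500
example_rate = per_1000_patient_days(example_days_of_therapy, 821)

print(round(share_enrolled, 1), round(mean_rx, 2), round(example_rate, 1))
```

A rate above 1,000 per 1,000 patient-days, as reported for both DDD and DOT, means that on average more than one antibiotic dose-day accrued per patient-day, which is consistent with frequent combination therapy.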

18

Neonatal admission as a marker of risk for poor educational attainment and special educational needs in children aged 5-11 years

John, A.; Pike, C.; Olga, L.; Sovio, U.; Wong, H. S.; Smith, G. C.; Aiken, C.

2026-07-17 pediatrics 10.64898/2026.07.15.26358132 medRxiv

Top 19%

0.0%

Background: Children born prematurely (before 37 weeks) or admitted to the neonatal unit (NNU) are at increased risk of adverse long-term physical health outcomes. Associations with later academic performance and special educational needs are also recognised; however, it is not clear whether these broad risk factors could be used as stand-alone heuristics to identify children who may benefit from additional support in educational settings. We aimed to examine the associations between NNU admission and educational attainment in mid-childhood. Methods and Findings: Pregnancy data from a prospective birth cohort (Pregnancy Outcome Prediction Study, Cambridge, United Kingdom, 2008-2012) were linked to national educational outcomes (Department for Education, United Kingdom). Multivariable regression models adjusted for maternal, child, and socioeconomic factors were used to evaluate associations between (i) all NNU admissions, (ii) term NNU admissions >48 hours, (iii) preterm birth without ongoing physical health needs, and educational outcomes at ages 5-11 years. Children who required any NNU care were more likely not to meet expected educational standards across multiple ages and domains in early and mid-childhood: early years foundation at age 5 (aOR 1.64, 95% CI 1.19-2.27, p=0.003), phonics at age 6 (aOR 2.43, 95% CI 1.72-3.57, p<0.001), and, at age 7 (where assessments were divided into multiple domains), reading (aOR 1.67, 95% CI 1.18-2.38, p=0.004), writing (aOR 1.72, 95% CI 1.25-2.38, p<0.001), mathematics (aOR 1.56, 95% CI 1.09-2.22, p=0.020), and science (aOR 1.85, 95% CI 1.22-2.78, p=0.003). Similar patterns were observed both among term-born infants who stayed >48 hours in the NNU (phonics assessment at age 6 aOR 2.26, 95% CI 1.51-3.36, p<0.001) and among children born preterm without long-term physical health sequelae (phonics assessment at age 6 aOR 3.07, 95% CI 1.96-4.81, p<0.001).
These associations were robust to adjustment for demographic, perinatal, and socio-economic factors. By age 11, differences in academic attainment were attenuated and no longer clearly distinguishable across all exposure groups. However, there was an increased likelihood of special educational needs (SEN) at age 11 associated with any NNU admission (aOR 1.78, 95% CI 1.15-2.73, p=0.009), term NNU admission for >48 hours (aOR 1.88, 95% CI 1.19-3.00, p=0.007), and preterm birth without long-term physical health sequelae (aOR 1.50, 95% CI 1.00-2.25, p=0.049). Predictive performance of any NNU admission for SEN at age 11 was moderate (AUC 0.70, 95% CI: 1.14-2.65, p=0.010), with balanced sensitivity and specificity and high negative predictive value. Conclusions: NNU admission, for both term and preterm infants, is associated with poorer educational outcomes and an increased likelihood of special educational needs in mid-childhood.
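As a reading aid, the reported adjusted odds ratios can be related to their p-values with a standard Wald approximation on the log scale. The sketch below assumes the intervals are symmetric on the log-odds scale with a 1.96 critical value; this is a common convention and is not stated in the abstract. The function name is hypothetical. It is applied to the age 5 result (aOR 1.64, 95% CI 1.19-2.27).

```python
import math


def wald_from_ci(ratio, lower, upper, z_crit=1.96):
    """Recover the log-scale standard error, z statistic and two-sided
    p-value from an odds ratio and its confidence interval.

    Assumes a Wald interval: log(ratio) +/- z_crit * SE.
    """
    log_se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / log_se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_se, z, p


# Age 5 early years foundation result from the abstract.
log_se, z, p = wald_from_ci(1.64, 1.19, 2.27)
print(f"SE(log OR)={log_se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under these assumptions the recovered p-value rounds to the reported 0.003, so the interval and p-value for that estimate are mutually consistent.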

19

Bridging surveillance gaps in dengue: a hierarchical model integrating mixed data sources for transmission estimation and vaccine targeting

Djaafara, B. A.; Elyazar, I. R.; Yosephine, P.; Surya, A.; Silalahi, F. S.; Handito, A.; Thohir, B.; Aryani, D.; Gunawan, D.; Nisa, A. K.; Prianto, E.; Samad, I.; Cook, A. R.; Huang, A. T.; Clapham, H. E.; Bhatt, S.; Mishra, S.

2026-07-17 epidemiology 10.64898/2026.07.15.26358208 medRxiv

Top 19%

0.0%

Estimating dengue force of infection (FOI) is essential for understanding transmission dynamics and targeting intervention programmes, yet the surveillance data needed for estimation in endemic settings are often incomplete and vary in format. We developed a Bayesian hierarchical catalytic model that jointly fits age-stratified case data, aggregate case data, and seroprevalence surveys within a single framework, incorporating external covariates to improve parameter identifiability. Synthetic validation showed that covariates alone recovered accurate FOI point estimates even when most districts contributed only aggregate data, but did so with poorly calibrated uncertainty; anchoring the model with a single seroprevalence survey was necessary to bring credible interval coverage close to nominal. Applied to 128 districts across Java and Bali, Indonesia (2016-2024), the model revealed substantial spatial heterogeneity in FOI and reporting rates. Many districts in Java exceeded the WHO-suggested seroprevalence threshold for vaccine introduction, yet were classified as low-priority when reported incidence was used as the prioritisation criterion, particularly in areas with weak surveillance. Model-based seroprevalence estimation, integrating multiple data sources, offers a more consistent basis for identifying high-priority districts for vaccine introduction, and is less susceptible to surveillance bias than reported incidence.
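For readers unfamiliar with catalytic models, the core relation between FOI and age-specific seroprevalence can be sketched in a few lines. This is the textbook constant-FOI, single-infection form and not the authors' hierarchical model, which additionally handles reporting rates, covariates and mixed data sources. The function names and the example values are hypothetical.

```python
import math


def seroprevalence(age_years, foi):
    """Expected proportion ever infected by a given age, assuming a
    constant per-year force of infection and lifelong seropositivity."""
    return 1.0 - math.exp(-foi * age_years)


def foi_from_seroprevalence(age_years, seroprev):
    """Invert the relation above to get the constant FOI implied by an
    observed seroprevalence at a single age."""
    return -math.log(1.0 - seroprev) / age_years


# Hypothetical example: a FOI of 0.1 per year, evaluated at age 9.
example_prev = seroprevalence(9, 0.1)
recovered_foi = foi_from_seroprevalence(9, example_prev)
print(f"seroprevalence at age 9: {example_prev:.3f}, recovered FOI: {recovered_foi:.3f}")
```

Because seroprevalence depends on cumulative exposure rather than on how many cases were reported, this relation suggests why a seroprevalence-based criterion can rank districts differently from reported incidence when surveillance is weak.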

20

Genetic sensitivity analysis: estimating genetic confounding and environmentally mediated genetic effects using multiple exposures

Frach, L.; Rijsdijk, F.; Hannigan, L. J.; Dudbridge, F.; Pingault, J.-B.

2026-07-17 epidemiology 10.64898/2026.07.16.26358236 medRxiv

Top 19%

0.0%

Polygenic scores are imperfect measures of the additive genetic effects of common genetic variants. The resulting measurement error biases estimates of quantities of interest in epidemiological analyses integrating polygenic scores. For example, how much of an exposure-outcome association is genetically confounded can be substantially underestimated when using polygenic scores alone. Here we present extensions to Gsens, a genetic sensitivity analysis, which aims to correct for such measurement error using both polygenic scores and heritability estimates. Gsens now allows for multiple exposures and estimates several quantities of interest, i.e. genetic confounding, adjusted residual association (net of genetic confounding), genetic overlap and environmentally mediated genetic effects. We present derivations and simulations showing how Gsens accounts for measurement error in the polygenic score; we also show how estimation may be affected by misspecifications of the causal structure between exposures. Applying Gsens in the Norwegian Mother, Father and Child Cohort Study (MoBa), we uncover, among other results, substantial genetic confounding in the associations between multiple known risk factors for attention deficit hyperactivity disorder (ADHD), such as low birth weight and temperament, and measures of ADHD in childhood. The updated Gsens R package offers multiple options, including for missing data handling and customisable syntax. Our extended version of Gsens is applicable to a broad range of substantive questions in multiple disciplines.
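The underestimation of genetic confounding when adjusting for a polygenic score alone can be illustrated with a toy linear model. The sketch below is not Gsens and does not use its method; it only works out, from population covariances, the exposure coefficient that remains after adjusting for a noisy proxy of a genetic confounder. All names and parameter values are hypothetical.

```python
def residual_slope(a, b, noise_var):
    """Coefficient on X when regressing Y on X and a noisy score P.

    Toy model (hypothetical): G has variance 1; X = a*G + e_x;
    Y = b*G + e_y; P = G + u. The errors e_x, e_y and u are independent,
    with Var(e_x) = Var(e_y) = 1 and Var(u) = noise_var.
    X has no causal effect on Y, so full adjustment for G gives 0.
    """
    var_x = a * a + 1.0
    var_p = 1.0 + noise_var
    cov_xy = a * b
    cov_xp = a
    cov_py = b
    # Standard two-predictor OLS formula for the coefficient on X.
    num = cov_xy * var_p - cov_xp * cov_py
    den = var_x * var_p - cov_xp ** 2
    return num / den


unadjusted = (0.5 * 0.5) / (0.5 * 0.5 + 1.0)   # slope of Y on X alone
perfect = residual_slope(0.5, 0.5, 0.0)        # score measures G exactly
noisy = residual_slope(0.5, 0.5, 9.0)          # score captures 10% of Var(P)
print(unadjusted, perfect, noisy)
```

In this toy setting, adjusting for the noisy score removes only a small part of the purely confounded association, which is the kind of measurement-error bias the abstract says Gsens aims to correct using heritability estimates.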